PatNet: A Lexical Database for the Patent Domain

Authors

  • Wolfgang Tannebaum
  • Andreas Rauber
Abstract

Boolean retrieval is particularly common in the patent domain. Despite its importance, little current research assists patent experts in formulating Boolean queries. Existing approaches are mostly limited to standard dictionaries, such as WordNet, as sources of synonymous expansion terms. In this paper we present a new approach to support patent searchers in the query generation process. We extract a lexical database, which we call PatNet, from real query sessions of patent examiners of the United States Patent and Trademark Office (USPTO). PatNet provides several types of synonym relations. Further, we apply several query term expansion strategies to improve the precision of PatNet in suggesting expansion terms. Experiments based on real query sessions of patent examiners show a drastic increase in precision when the suggestions take into account the support of the synonym relations, US patent classes, and word senses.
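As a rough illustration of synonym-based Boolean query expansion, the sketch below turns each query term into an OR-group of the term and its synonyms and joins the groups with AND. The lexicon entries, the function names, and the precision definition (the share of suggested terms that the searcher actually used) are all assumptions made for this example; they are not taken from PatNet or from the paper's evaluation.

```python
from typing import Dict, List, Set

# Hypothetical toy lexicon: entries are invented for illustration
# and are NOT taken from PatNet.
TOY_LEXICON: Dict[str, List[str]] = {
    "car": ["vehicle", "automobile"],
    "engine": ["motor"],
}


def expand_boolean_query(terms: List[str], lexicon: Dict[str, List[str]]) -> str:
    """Build one OR-group per query term (term plus synonyms), joined by AND."""
    groups = []
    for term in terms:
        alternatives = [term] + lexicon.get(term, [])
        if len(alternatives) == 1:
            # No synonyms known: keep the bare term.
            groups.append(term)
        else:
            groups.append("(" + " OR ".join(alternatives) + ")")
    return " AND ".join(groups)


def suggestion_precision(suggested: Set[str], used: Set[str]) -> float:
    """Assumed measure: fraction of suggested terms the searcher actually used."""
    if not suggested:
        return 0.0
    return len(suggested & used) / len(suggested)


print(expand_boolean_query(["car", "engine", "piston"], TOY_LEXICON))
# (car OR vehicle OR automobile) AND (engine OR motor) AND piston
print(suggestion_precision({"vehicle", "automobile"}, {"vehicle"}))
# 0.5
```

Under this assumed measure, filtering suggestions (for example by patent class) would raise precision whenever it removes terms the searcher would not have used.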

Similar resources

Effect of Log-Based Query Term Expansion on Retrieval Effectiveness in Patent Searching

In this paper we study the impact of query term expansion (QTE) using synonyms on patent document retrieval. We use an automatically generated lexical database from USPTO query logs, called PatNet, which provides synonyms and equivalents for a query term. Our experiments on the CLEF-IP 2010 benchmark dataset show that automatic query expansion using PatNet tends to decrease or only slightly imp...

Japanese-to-English Patent Translation System based on Domain-adapted Word Segmentation and Post-ordering

This paper presents a Japanese-to-English statistical machine translation system specialized for patent translation. Patents are practically useful technical documents, but translating them requires different techniques from general-purpose translation. There are two important problems in Japanese-to-English patent translation: long distance reordering and lexical translation of many domain-spec...

Identification of BKCa channel openers by molecular field alignment and patent data-driven analysis

In this work, we present the first comprehensive molecular field analysis of patent structures on how the chemical structure of drugs impacts the biological binding. This task was formulated as searching for drug structures to reveal shared effects of substitutions across a common scaffold and the chemical features that may be responsible. We used the SureChEMBL patent database, which prov...

Constructing a Broad-coverage Lexicon for Text Mining in the Patent Domain

For mining intellectual property texts (patents), a broad-coverage lexicon that covers general English words together with terminology from the patent domain is indispensable. The patent domain is very diffuse as it comprises a variety of technical domains (e.g. Human Necessities, Chemistry & Metallurgy and Physics in the International Patent Classification). As a result, collecting a lexicon t...

The Lexical Field of Word, Discourse, and Book in the Mystical School of Ibn Arabi

The range of mystical topics in related texts expanded greatly in the late 6th and 7th centuries (A.H.). This expansion led to the use of new lexical items to express new concepts. One such domain is the lexical domain used to describe different levels of existence. Letter (harf), word (kalameh), discourse (kalām), book (ketāb), pen (qalam), etc. are parts of this domain. This wo...


Publication date: 2015